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Abstract 

Wc present a numerical method for computing optimal transition pathways 
and transition rates in systems of stochastic differential equations (SDEs). In 
particular, we compute the most probable transition path of stochastic equations 
by minimizing the effective action in a corresponding deterministic Hamiltonian 
system. The numerical method presented here involves using an iterative scheme 
for solving a two-point boundary value problem for the Hamiltonian system. We 
validate our method by applying it to both continuous stochastic systems, such 
as nonlinear oscillators governed by the Duffing equation, and finite discrete 
systems, such as epidemic problems, which are governed by a set of master 
equations. Furthermore, we demonstrate that this method is capable of dealing 
with stochastic systems of delay differential equations. 



1. Introduction 

One important aspect of the study of dynamical systems is the study of noise 
on the underlying deterministic dynamics [l|, Q • Although one might expect the 
deterministic dynamics would be only slightly perturbed in the presence of small 
noise, there are now many examples where noise causes a dramatic measurable 
change in behavior, such as noise induced switching between attractors in con- 
tinuous systems Q , and noise induced extinction in finite size systems 

In systems transitioning between coexisting stable states, much research has 
been done primarily because switching can be now investigated for a large vari- 
ety of well-controlled micro- and mesoscopic systems, such as trapped electrons 
and atoms, Jose phs on j unctions, and nano- and micro-mechanical oscillators 
0, S 0, 0, S BE, [H E El- In these systems, observed fluctuations are 



usually due to thermal or externally applied noise. However, as systems become 
smaller, an increasingly important role may be played also by non-Gaussian 
noise. It may come, for example, from one or a few two-state fluctuators hopping 
at random between the states, in which case the noise may be often described 



as a telegraph noise. It may also be induced by Poisson noise jl4| . 
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In finite size populations or systems, extinction occurs in discrete, finite 
populations undergoing stochastic effects due to random transitions or per- 
turbations. The origins of stochasticity may be internal to the system or 
may arise from the external environment |l5l . [lij , and in most cases is non- 
Gaussian [13, Ell- 

Extinction depends on the nature and strength of the noise [l9j . outbreak 
amplitude [l^l and seasonal phase occurrence [2lj. For large populations, the 
intensity of internal population noise is generally small. However, a rare, large 
fluctuation can occur with non-zero probability and the system may be able 
to reach the extinct state. Since the extinct state is absorbing due to effective 
stochastic forces, eventual extinction is guaranteed when there is no source of 
reintroduction [Ij, [H, [H . 

Models of finite populations, which include extinction processes, are effec- 
tively described using the master equation formalism, and predict probabilities 
of rare events [l7| . For many problems involving extinction in large populations, 
if the probability distribution of the population is quasi-stationary, the prob- 
ability of extinction is a function that decreases exponentially with increasing 
population size. The exponent in this function scales as a deterministic quantity 
called the action [^J] ■ It can be shown that a trajectory that brings the system 
to extinction is very likely to lie along a most probable path, called the optimal 
path. It is a major property that a deterministic quantity such as the action can 
predict the probability of extinction, which is inherently a stochastic process, 
and is also formulated in continuous systems driven by noise [l^ [2^ . 

Locating the optimal path is important since the quantity of interest, whether 
it is the switching or extinction rate, depends on the probability to traverse this 
path. Therefore, a stochastic control strategy based on the switching or extinc- 
tion rates can be determined by its effect on the optimal path [26 1. 

The optimal path formalism converts the entire stochastic problem to a 
mechanistic dynamical systems problem with definitive properties. First, the 
optimal path is a solution to a Hamiltonian dynamical system. In the case of 
continuous stochastic models, the dimension of the system is twice that of the 
original stochastic problem. The other dimensions are conjugate momenta, and 
typically represent the physical force of the noise which induces escape from a 
basin of attraction to either switch or go extinct. Finally, due to the symplectic 
structure of the resulting Hamiltonian system, it can be shown that both at- 
tractors and saddles of the original system become saddles of the Hamiltonian 
system. 

One of the main obstacles to finding the optimal path is that it is an in- 
herently unstable object. That is, if one one starts near the path described by 
the Hamiltonian system, then after a short time, the dynamics leaves the neigh- 
borhood of the path. In addition, although the path may be hyperbolic near 
the saddle points, it may not be hyperbolic along the rest of the path. Solving 
such problems using shooting methods for simple epidemic models [27, 28, 29j . 



or mixed shooting using forward and backward iteration |3d . |31[ will in general 
be inadequate to handle even the simplest unstable paths in higher dimensions. 
Therefore, it is the goal of this paper to exemplify a robust numerical method to 
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solve for the optimal path using a general accurate discrete formulation applied 
to the Hamiltonian two point boundary value problem. 

The method we employ here to compute the optimal paths is similar to 
the generalized minimum action method (gMAM], j32| which is a blend of the 
string method (ssj and minimum action method (34 | . Both are iterative methods 
which globally minimize the action along the path. These other techniques differ 
from ours primarily in that our formulation of the problem allows a direct, fully 
explicit iterative scheme, while the gMAM, in particular, employs a semi-implicit 
scheme. The numerical scheme presented here should provide an easy to employ 
alternative to the methods discussed above for stochastic optimization problems 
which are formulated as Hamiltonian two-point boundary value problems. 

The paper is organized as follows. We first briefly present the general SDE 
problem, and the formulation of the corresponding deterministic Hamiltonian 
system by treating the switching as a rare event. Wc then present the details 
of the numerical approximation technique we will use to find the path which 
maximize the probability of switching. Using this technique, we then demon- 
strate finding the optimal path from a stable focus to the unstable saddle for 
the unforced Duffing equation, then compute optimal extinction pathways for a 
simple epidemic model, and finally adapt our method to find optimal transition 
paths in stochastic delay differential systems. 

2. General Problem 

Consider a general stochastic differential equation of the form 



where x g K" represents the physical quantity in state space and and the matrix 
is given by G{x{t)) = diag{gi{t), g2{t), ...gn{t)}, where the gi's are general 
nonlinear functions. We suppose the noise ^ G M" is a vector having a Gaussian 
distribution with intensity D, and independent components. It is characterized 
by its probability density functional = e~^«/^. 



We wish to determine the path with the maximum probability of traveling from 
the initial state xa to the final state xb, where the initial and final states are 
equilibria of the noise- free (i.e. ^ = 0) version of Eq. [T] given by /(a;) = 0. These 
states typically characterize a generic problem of study in stochastic systems, 
such as switching between attractors, escape from a basin of attraction, or 
extinction of population. We assume the noise intensity D is sufficiently small 
so that in our analysis sample paths will limit on an optimal path as — >■ 0. 



'^Throughout the paper, boldfaec lower-ease letters will indieate vectors, while boldface 
upper-ease letters will indieate matrices. 



x{t)=f{x{t))+G{x{m{t) 



(1) 
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We also remark that ^ is formally the time derivative of a Brownian motion 



sometimes referred to as white noise 35 



For D sufficiently small, and examining the tail of the distribution for a large 
fluctuation (which is assumed to be a rare event), the probability of observing 



such a large fluctuation scales exponentially by |36l . |37| , 

T'. =e-^/^, R = mmn{x,i,p), (3) 

where, 

n{x,£,,p)=R^ + Jp-[x-f{x)-Gix{mi (4) 

where the Lagrange multipliers p also correspond to the conjugate momenta of 
the equivalent Hamilton- Jacobi formulation of this problemo The exponent R 
of Eq. |3] is called the action, and corresponds to the minimizcr of the action 
in the Hamilton- Jacobi formulation which occurs along the optimal path. This 
path will minimize the integral of Eq. 21 and is found by setting the variations 
along the path 6TZ to zero. 

The resulting equations of motion for the states and Lagrange multipliers 
are given by 

x = fix{t)) + G^{x)p 

dG dfjx) (5) 

Here [^{x)pp]i = [^{x)]ijk[p]j[p]k, where Einstein summation is assumed 
over repeated indices. Note that p = is invariant, and recovers the noise free 
case of Eq. [T] The above equations can be shown to satisfy the motion of a 
Hamiltonian system with Hamiltonian 

Hix^p)^^^^i^+p.fix). (6) 

That is, the dynamics of a given path satisfy x = ^^Qp'^'^ p = — ^^g^'^^ ■ 

In addition to solving the Hamiltonian system of dynamics, the full problem 
specification of an optimal path requires boundary conditions for both state x 
and momenta p. The boundary conditions of the optimal path consist of two 
steady states, Xa = {xa,Pa) and Xb = {xb^Pb) at equilibrium. Typically, the 
boundary conditions arc derived from the equations of motion defined by Eq. |S1 
Since they are derived as steady state conditions, they are asymptotic boundary 
conditions that are infinite limits in the temporal line. In addition, since we 
have assumed the Hamiltonian is time-independent, energy is conserved, and 
the path must lie on a fixed energy surface. Notice, in the case where the 
Hamiltonian is time invariant, and given that the action is minimized along the 
optimal path, the Hamilton- Jacobi equations require a zero-energy constraint, 
i.e. H{x,p) = 0. We now describe how to solve the Hamiltonian system as a 
two point boundary value problem on a restricted energy surface. 



The vector multiplication here is assumed to be an inner product. 
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2.1. Stability of Steady State Solutions 

As stated above, we seek the optimal path between two steady state solutions 
Xa = {xa,Pa) and Xb = {xb,Pb) where xa and xb are steady state solutions 
of the zero-noise case of Eq. [U and thus satisfy f{xA) = f{xB) = 0. Depending 
upon the linear stability of the zero-noise steady states, the stochastic transition 
from Xa to Xb may or may not be a rare event. For example, if the stability 
matrix of the general SDE Eq. [T] A = '^^dx^^ ^ f'i^A) has at least one 
eigenvalue with positive real part, then it is an unstable or saddle point in 
the stochastic equation. Further, if /'(a;^) has eigenvalues with all negative 
real parts, then it is a stable focus, and the transition from xa to xb will be 
deterministic restricted to p = and therefore not a rare event. The rare event, 
in this case, would be a transition from a;^ to xa, meaning noise drives the 
system out of the stable focus and onto the unstable/saddle point. 

It is worth noting that all steady state solutions of the Hamiltonian system 
[S] which correspond to the deterministic steady state solutions of Eq. [1] i.e. 
f{x) = 0, are saddle points. It is easy to see why this is true in the case of 
additive noise, i.e. G = I. We can classify the steady states of Eq. [5] by 
calculating the eigenvalues of its stability matrix Q 



Q = 



f'{xA,B) 


I 




A 


/ 


f"ixA,B)PA,B 


-f{xA^B)_ 










at the steady states. The eigenvalues 7^. of Q arc given by. 



A 


/ 








Azi + Z2 


= Ik 











Z2 




-AZ2 


Z2 



If we assume Z2 = 0, then Azi = "^k.Zi. Thus, if Afe e M is an eigenvalue of A, 
the stability matrix of the SDE problem given by Eq. [TJ then 74, — Afe is an 
eigenvalue of Q with eigenvector [2:1, 0]-^. Similarly, if 2:1 =0, then -Az2 = "fkZ^, 
which implies that 7^ = — Afc is an eigenvalue of Q with eigenvector [0, 22]^. 
Since ±Afe are eigenvalues of Q, we can conclude that every steady state solution 
of Eq. [S]has eigenvalues with both positive and negative real parts and, thus, 
is a saddle. Thus, every steady state solution whose linearization has non- 
zero real part of the general stochastic equation, Eq. [1] regardless of stability 
becomes a saddle point when the system is converted to a Hamiltonian system. 
That is, deterministic attractors and saddles map to saddles in the Hamiltonian 
formulation. 



3. Numerical Scheme 

Our numerical approach involves using a finite differences scheme to write 
the system of ODEs as a high dimensional algebraic system, to which we apply 
a modified Newton's Method to minimize the residual error until a solution is 
reached (to within some desired tolerance). As before, assume the Hamiltonian 
system in R^" admits two steady states, Xa and Xb- 
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Then we seek the optimal path on the zero-energy surface that connects Xa 
to Xb- In this formulation, one would expect such a path to exhibit several 
properties when parametrized along t € (—00,00). First, we assume the path 
starts at Xa at f = — 00. Since this point is an equilibrium solution, we should 
expect that the solution stays very near this value, that is, there exists e > 
such that \Xa ~ X{t)\ < e for —00 < t < — T^, has a transition region from 
— <t <T^ and finally stays near Xb^ the second steady state for, < t < 00. 
Numerically, we will approximate the solution on the finite domain —T^<t<T^ 
so that the value at X(±Te) is arbitrarily close to the steady solution. 

We map the interval [— T^, Tell onto [0, 1] using the linear transformation 
t = 2Tft — Te, and drop the "bar" notation for readability. On this discrete 
time domain, we write the system of ODEs as a system of nonlinear algebraic 
equations using central differences. 

The simplest method is to employ a finite step size h — 1/N and use a 
uniform time step subdividing [0, 1] into + 1 equal segments. In practice, 
however, the simple uniform step size is not always the best choice since the 
optimal path tends to stay very near the stable points throughout most of the 
domain, and sometimes makes a relatively sharp transition near the center of 
the domain. In this case, it is helpful to use a nonuniform grid to resolve the 
sharp transition region using a fine mesh, and to use a coarse mesh near the 
edges where the solution is mostly flat. Thus, for the nonuniform time step hk, 
yielding the time series t^+i = tk + hk and corresponding function values Xk, 
the derivative is approximated by the operator (5/i, 

d hl_^Xk+i + {hi - hl_^)xk - hlxk-i 

—Xk ~ ShXk = ,2 , , ,2 • 9) 

at hk-ihl + hkhl_^ 

At this point, we can write the generic system of 2n{N + 1) nonlinear alge- 
braic equations: 



OhXk T\ =0 ^hPk^ = 0, fc = 0, 1, A*, (lUj 

op ox 

and solve this system using a general Newton's Method for nonlinear systems of 
equations. To properly apply Newton's method here, let = {xi^j...XN.jTPi.j---PN,j}'^ 
be the extended vector of dimension 1 x 2nN containing the j-th Newton iter- 
ate (recalling that Xk.j,Pk,j & are defined on the timeseries given by fc = 
0, 1, A^). Then j = will represent the initial guess. Let T : 7^2"^ TZ'^"^ 
be the function defined by Eq. [10] acting on qj . Then to find the zeros of 
J^{q) we employ a Newton scheme. A new Newton iterate is given by solving 
the linear system Jj^{qj){qj+i — qj) = —J-{qj), using any one of a variety of 
methods such as LU decomposition or the generalized minimal residual method 



■^For the simulations below, \Tc\ > 100 unless otherwise specified. Ideally, is picked large 
enough so that \xa — Te)| < 10~^® and \xb — '"(Te)! < 10~^®. i.e. the steady states are 
obtained up to machine precision for double precision numbers. 



6 



(GMRES) with appropriate preconditioners. Throughout this paper we will 
use LU decomposition with partial pivots optimized for a sparse linear system. 
Here the Jacobian Jj^{qn) is computed approximately using a central difference 
scheme. 

Formally, this method is second order with respect to h^. The initial guess 
for this algorithm is constructed by the knowledge that the optimal path spends 
most of its time near the stable equilibria, and has a brief but sometimes sharp 
transition between the two states. One choice that has worked in practice is 
using functions like q^fc = {xA^XB)/{l + e'"^'') + XB^ with fc = 0, 1, ...N (where 
the C > parameter adjusts the sharpness of the jump), which have horizontal 
asymptotes at the appropriate critical values. Usually, though not always, qg.fc 
is set so that for A: = + 1, + 2, ...2N, qo,fe = 0. 

Note that the zero-energy surface constraint H{x,p) = is not imposed. 
Rather, the initial guess will start out near Xa (at t = 0) and Xb (at t = 1) 
which lie asymptotically close to the zero energy surface, and thus the final 
solution will have to lie on this surface since such a solution is time invariant (i.e. 
d/dtH{x,p) = 0). At each iterate, both the residual error and the Hamiltonian 
are checked at each point, and both must reach a desired tolerance in the Coo 
norm before the procedure is completed. 

Once the optimal path is computed, the action (i.e. the exponent) along the 
optimal path may be obtained with a simple integral, 



Exhaustive convergence tests on the residual error for a variety of test prob- 
lems for both the uniform and non-uniform grids have demonstrated the second 
order convergence for this method. We have noted some dependency on the 
initial guess in terms of the overall speed of convergence (or divergence for a 
particularly bad guess). The method generally produces a unique solution (up 
to possibly a horizontal shift in the time series seen in a few examples, which 
does not affect the path integrals of interest). This method has been reliable 
for a wide parameter regime for each of the test problems, with the limitations 
to be discussed below on a case by case basis. 

This method, which we will henceforth refer to as the Iterative Action Mini- 
mization Method (lAMM) , has several distinct advantages over other methods, 
the foremost of which is straightforward scalability to higher dimensions. For 
very high dimensional problems, the systems will eventually become too large to 
treat easily with a single processor, but this algorithm has proven efficient for up 
to six dimensional problems with a single processor. Further, this method lends 
itself to infinite dimensional problems, such as time delay stochastic differential 
equations, as will be demonstrated below. 

4. Noise Induced Transitions 

We demonstrate the numerical techniques by examining several bistable dy- 
namical systems. Using methods discussed above, we explicitly approximate 




1 



dx{t) 
dt 




(11) 
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the optimal path between the two states and then numericahy integrate along 
the path directly to compute the action. 

Switching in the Duffing Equation 

One of the standard nonlinear dynamical systems which exhibits bi-stability 
is DufSng's equation. This equation is used to model certain types of nonlinear 
damped oscillators, and here we consider the singularly perturbed and unforced 
version [38j . 

i = y + gi{x,y)(.i{t), 

ey = ax - /3x^ ~ dy + g2{x,y)^2{t) (12) 
e = 0, 

Here a and (3 control the size and nonlinear response of the restoring force, 
while 6 controls the friction or damping on the system. The terms and ^2 
are uncorrelated white noise sources applied to the acceleration and velocity 
respectively. When the perturbation e ^ 1 fast and slow manifolds can be 
identified, while e = 1 gives the unconstrained case. Rescaling time t' = {l/e)t, 
applying eq. [SI and following the methodology above, we can write the system 
in the following general form, 

H{x, y,Px,Py) = f^l^M! + + ep.,y+py{ax - I3x' ~ 8y) = 0, (13) 

with corresponding equations of motion, 

i = ^{Pxgiixf +y) 
y = Py92{yY + ax~ I3x^ - 5y 
Px = -<^pl9i{x)g[{x) - py{a - 3(3x'^) 

Py = -pl92{y)g'2{y) - <^Px + 5py. 

We will restrict our focus to the additive white noise case, giix, y) = g2{x, y) = 
1. 

Note that the Hamiltonian system emits three known steady states Xa = 
(0,0,0,0) and Xb = (±-y/a//3, 0, 0, 0), all of which are saddle points. These 
steady states correspond to zero-noise critical points of Eq. [121 = (0,0), a 
saddle point, and xb = {±y/a/l3, 0), the centers of the stable foci. The path 
from XA to Xb can be found deterministically when Px = Py = 0, as any solution 
perturbed from the saddle node point will move along the solution curves and 
end up at cither stable focus. A more interesting case is the optimal path from 
one of the focus points to the saddle-node point, which will require non-trivial 
momentum. Such momenta model the small noise effects which organize to force 
the trajectory across the basin of attraction, thus escaping from one attractor 
to the other. 

Figure [1] shows both the deterministic and optimal paths as computed using 
the lAMM developed above. For the deterministic path, the algorithm correctly 
predicts that the noise will be zero along this path, and that the action will be 
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(14) 



zero as the probability of going from Xa to Xb is one, and thus not a rare 
event. On the other hand, the path from Xb to Xa involves nontrivial action, 
and the effect of the noise along the optimal path is shown. This path lies on the 
zero-energy surface, and maximizes the probability of traveling from Xb to Xa 
for arbitrarily small noise intensities, D. Figure [5] shows the residual error at 
each iterate used to generate the data for figure [T] until the convergence criteria 
is reached. 




X ■ ■ Px 

(a) (b) 

Figure 1: Projections of the optimal path (a) onto phase space and (b) onto momenta space. 
In (a) we show both the optimal (from {\J aj fi, 0) to (0,0), labeled with a solid lino) and 
deterministic path (from (0,0) to (y^a//?, 0), labeled with a dashed line) as predicted by our 
numerical method. Along the deterministic path, the action is zero and pa, and are zero, 
but the corresponding noise components along the optimal path are shown in (b). Here e = 1, 
^ = .25, 5 = 1, and a = 1. 




02468 02468 
Iterate Iterate 



(a) (b) 

Figure 2: The maximal residual error (a) and the Hamiltonian (b) of the numerical scheme 
as a function of iterate, where the initial guess is iterate 0. This figure shows the convergence 
for the data shown in Figure [T] where the parameter values are e = 1, /3 = .25, 5 = 1, and 
o = 1, and the error threshold is met after the ninth iteration. The maximal error is defined 
to be the maximum residue of all the time series for all the components. 

When e <g; 1 , we can derive an analytical formulation for the action by using 
a center manifold analysis on Eq. [T3] by following the work of [s!] . Here the 
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center manifold is given by y = h(x, e), and we approximate it as, 



y = h{x, e) = /io(x) + eh^{x) + ©(e^). (15) 

By substituting equation [15] into equation [T^] and equating like powers of e, 
we arrive at a one dimensional form of equation [T^] for the lowest order terms 
involving e, 

i = ^ (ax - I3x^ + e(t)) . (16) 

Here the contribution of the uncorrelated noise terms ^1.2 are contained in 
a single noise source ^. The Hamiltonian form of 1161 is. 

2 

H{x,p) = ^+p-^{ax-px''). (17) 

We can find the nontrivial relationship (p 7^ 0)between p and x on the zero 
energy surface directly, p = — '^{ax — jSx^) , and integrate along this path to 
predict the action along the optimal path. Since one may be interested in how 
the action scales relative to the distance between the two critical points we 
substitute the values = a and &^ = -g to better illustrate the scaling with 
respect to /?. Thus, with this substitution, we seek the action from x = a& to 
a; = 0. From equation [TTJ this is a simple integral. 

Thus, we can predict, for example, that the action from [ah, 0) to (0, 0) should 
scale like the square of h near the center manifold. Indeed, varying the parameter 
h for the four-dimensional system, Eq. 1141 predicts the same order scaling as 
seen below in Fig. [3l We also consider the scaling with respect to the damping 
parameter (5, and again, near the center manifold, the predictions are born out by 
integrating along the optimal path. Indeed, the scaling for the action predicted 
from the lowest order terms in the center manifold analysis seems to persist in 
the two-dimensional model. 

Since it is not clear if the same relationship will work further away from 
the center manifold, we check the scaling with respect to h and 5 when e = 1 
in EqlU in figure H) Interestingly, the leading order scaling of the action with 
respect to jS is still 2, just as it was near the center manifold. Meanwhile, the 
scaling with respect to 5 is markedly different. Thus, we can predict both near 
and away from the center manifold how the action will scale as a function of 
the distance between the two equilibrium points. 



4-2. SIS Epidemic Models 

Next we consider a simple susceptible-infected-susceptible (SIS) epidemic 
model in the general noise case, as considered by [25|. This model is defined 
by a master equation which describes the probability of fiuctuations between a 
susceptible or infected category in a population of TV individuals. Assuming TV 
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3.5 4 4.5 5 -0.8 -0.6 -0.4 -0.2 0.2 

ln(b) ln(5) 

(a) (b) 

Figure 3: The scaling of the action as a function of the parameters when e = .05. For the two 
sweeps, the action is obtained by using the lAMM on Eq. I14l with a = 1 and 5 = 1 while b is 
swept in (a) and with a = 1 and 6 = 20 while 5 is varied in (b). Since e is small, we expect 
the center manifold approximation of Eq. llSl to be valid. Indeed, the linear best fit in (a) is 
In(iJ) = 1.9989 In(fe) - 2.8417 while in (b) the equation is ln(i?) = -.958666 ln(<5) 4- 2.3008. 
The slope of both of these lines are close to the prediction given by equation 1181 




Figure 4: The scaling of the action as a function of the parameters when e = 1. For the two 
sweeps, the nonlinear term b is swept for fixed 5 = 1 in (a) while fe = 20 is fixed while S is 
varied in (b). The linear best fit in (a) is ln(i?) = 21n(6) — 1.655 while in (b) the power law 
scaling vanishes as S is increased. The second order scaling of the action with respect to b is 
persistent even away from the center manifold. 



is sufBciently large, we can write a mean field system of equations for the change 
of the population fractions of susceptible and infected individuals, denoted xs 
and xj respectively, 

is = fJ.- lixsxi + KXi - nxs 

a ' ^ 

Xj = pXsXj — KXj — fJiXi- 

For simplicity, the population in the mean field is assumed constant, i.e. births 
and deaths are equal. Here ^ is the natural birth and death rate of both the 
susceptible and infected populations, p is the contact rate, and k is the natural 
recovery rate of the infected population. 

Since we have assumed the population fraction of susceptible and infected 
individuals are conserved, we have xi = \ — xs- Under this constraint, assum- 
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ing a small random fluctuation of only the infected individuals, we can use a 
methodology similar to the one described above (and worked out in detail in 
25[) to write a Hamiltonian system, 



H{xj,pi) = I3xi{l - xi){eP' - 1) + (Ai + >i)xi{e-P' - 1). (20) 

We shall refer to equation EOl as the ID SIS model equation. The optimal path 
will extend from the endemic state at a;/ = 1 — l/i?o to the extinct state xj = 0, 
where i?o = /^/(a* + '^)- Using the methods introduced above, the optimal path 
(at typical parameter values) is given below. Integrating the momentum along 
this path gives the action, which is proportional to the probability of extinction. 
In this simple ID SIS model, the predicted action scaling as a function of Rq is 
given by solving the Hamiltonian directly for pi and solving the integral [TT] 

[ -ln[i?o(l-a;/)]c;a;. 

Since this is one of the rare cases the optimal path can be found analytically, 
it is a great test case for our method. A comparison of the analytical action to 
the numerical action is shown in figure [S] 




Figure 5: The action as a function of Rq as predicted by an analytical expression and our 
numerical method. Here, k = 100, and fi = .2 while 13 is varied. 



In the case of an SIS epidemic model with independent fluctuations on both 
the susceptible {xs) and infected (x/) populations, the Hamiltonian form of the 
equations is given by, 

H{xs,xi,ps,Pi)= fiieP'~l)+px,xiie-P''+P'-l)+KXiieP'-P'-l)+fixs{e-P'-l)+fixiie-P'^l), 

(21) 

and we refer to this form as the 2D SIS model. The two states of interest for this 
system are the endemic state, {l/Rg,! — 1/ Rq,0,0) and the nontrivial extinct 
state, (1, 0, 0, ln(l/i?o))- Using the procedure discussed above, we compute the 
optimal path from the endemic to the extinct state, and show a typical result 
in figure [6l 

Instead of comparing the action scaling predicted here to an analytical for- 
mulation, we instead compare it to a Monte-Carlo simulation of the master 
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Figure 6: Projection of the optimal path onto (a) population space and (b) momenta space 
for a sample SIS system with /3 = 104, k = 100, and /i = .2. 



equation for the initial system with a fixed population size of 20,000 individu- 
als. In the Monte-Carlo simulation, a Gillespie algorithm is employed on the 
SIS model, and from all the simulations, a probability of extinction is computed 
for several values of Rq. From these, a mean extinction time is derived, and the 
log of this mean extinction time should scale like the action predicted from the 
action integrals, from Eq. computed using our approach. Figure [7] shows a 
comparison of the two approaches, and good agreement is seen between these 
two independent methods. 




Figure 7: A comparison of the action predicted by computing the optimal path to a Montc- 
Carlo simulation of the original system. Here, /3 is varied while k = 100 and /i = .2. 



4-3. Finite Time Lyapunov Exponents 

As demonstrated in [l7|, the optimal path to extinction coincides with ridges, 
i.e. maximal values, of the finite time Lyapunov exponents (FTLEs). Forgoston 
and others propose finding FTLE ridges as a method of computing optimal 
paths 17, 3, 331 . Here, we demonstrate that our approximation to the optimal 
path does indeed locally maximize the FTLE. 

We proceed by using the methods outlined in 4^, 4l|, IH, to approximate 
the FTLEs at points on our optimal path, and at nearby points transverse to 
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the optimal path. We begin by picking a point on the optimal path (generated 
by our method), and on nearby points some small distance away from the path. 

For the given vector field, we assume we have a flow passing through initial 
point xo, : 7^" ^ n'\ such that (?!)*°+^(a;o) = x{to + T;to,xo). The local 

linear variation at (xq) is defined by A{xQ,to +T) ~ — ''^^Bx^ ■ 

Using a fourth order Runge-Kutta method, we can integrate all the initial 
points , Xq, forward in time over a fixed interval, and compute the finite time 
deformation rate of the local coordinates (i.e. Right Cauchy-Green Tensor) 
C{xo) = A'^{xo, to + T)A{xo, to + T). The maximal eigenvalue Xmax of C{xo) 
will give the FTLE a{y, i„ T) = i In yO^. 

Consider the case of the ID SIS extinction model from above. Here, we will 
use a path computed above and compare the values near this path to the local 
FTLEs. In Figure [8^ we plot the FTLEs over a square domain, and show that a 
local maximum (ridge) is attained precisely where the computed optimal path 
predicts. 



0.1 0.2 0.3 0.4 0.5 
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(b) 



Figure 8: Figure (a) shows the numerically computed optimal path overlaid with the FTLE for 
the one-dimensional SIS model. The optimal path (dashed line) occurs precisely along a ridge 
of the FTLE. Figure (b) shows a single slice of the FTLE computation for the two-dimensional 
SIS model, where r is the distance from the optimal path in the transverse direction along 
a unit vector, and —r represented a distance in the antipodal direction. For both paths, the 
parameter values from Figure [6] are used. 



In higher dimensions, the maximal Lyapunov exponent is still exhibited 



along the optimal path [17|. For the 2D SIS model, note that the optimal 



path exists in four dimensions. Thus, the transverse direction is the set of 
points obtained by rotating a normal vector to the path around two Euler an- 
gles (which forms a 3D sphere for a given radius r). To illustrate FTLEs in 
this higher dimension framework, we must consider the "shell" around the ini- 
tial starting point (and orthogonal to the path) at a fixed radius, and then 
compute FTLEs on this shell all along the optimal path. We expect that the 
maximal FTLE will occur along the optimal path, relative to nearby points 
on the transverse sphere. To illustrate this, we define uj_(t) as a unit vec- 
tor orthogonal to the tangent vector of the optimal path at a given time, i.e. 
u±{t) = {xi{t),-xs{t),0,0)/\{xi{t), ~xs{t), 0,0)\, and then examine the FTLE 



14 



as a function of the two Euler angles for a given r. Figure shows just one 
cross section (over a short time interval), obtained by setting both rotation an- 
gles to either or the anitipodal angle tt, of this high dimensional object, as a 
function of r, and demonstrates that the maximal FTLE is, indeed, along the 
optimal path. 



5. Time Delayed SDEs 

One advantage of the lAMM is that it allows the solution of stochastic delay- 
differential equations of the form, 

i(t) = f{x{t), x{t - r)) + G{x{tmt)- (22) 

Schwartz et. al [i^l have demonstrated that the methodology introduced in 
section 2 can be adapted to write this system as a Hamiltonian system, 

H{x, x^,p) = P +p. fix, xr), (23) 

where x^ = x{t — t). The equations of motion are given by, 



X = —(X,Xr,P) 

P= -^(a;,a;r,p) - ^ix{t + T),x{t),p{t + t)). 



(24) 



Note the appearance of both delay and advance terms in [24l Because of the 
appearance of the delay term, the Hamiltonian is no longer time invariant, and 
unlike in the previous examples, where H{x,p) — 0, the zero-energy condition 
is not conserved. 

We shall consider a one dimensional test case where f{x, Xr) = x{l — x)—jXr, 
where the steady states are given by xa = 1 — 7 and xb = 0, for < 7 < 1. 
Again, we will assume additive noise G = 1, and derive the Hamiltonian system, 

1,2 

i?(a;,a;^,p) = y +p(a:(l - .t) - 7x^), (25) 
and the corresponding equations of motion, 

X^x{l-x)--/Xr+p ^^g^ 

p = — p(l — 2x) + jp{t + t). 

The lAMM needs only a few minor adjustments to compute these paths numeri- 
cally. Primarily, the presence of the delay and advance terms will add additional 
entries into our linear system Eq. 1101 Our method uses a non-uniform timestep, 
and so x{tk — t) may not coincide exactly with one of our points Xk at time 
tk- To overcome this, we can use Lagrange interpolation on the closest four 
points Xj-2TXj-i,Xj,Xj+i such that tj-2 < tj-i < tk — t < tj < tj+i. Since 
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we keep the time domain fixed the j needed for each tk can be easily computed 
before the iterative scheme is started, and fewer or more terms can be used in 
the Lagrange interpolation scheme depending upon desired accuracy. Further, 
if tfc — T < or tfc + r > Tg. i.e. the delay or advance terms fall outside of our 
numerical domain, then we can set Xk = xa oi Xk = xb respectively. 

To demonstrate the effectiveness of the lAMM in solving these stochastic 
delay problems, we show sample paths in figure |9] and compare the scaling of 
the action versus a Monte-Carlo simulation of the stochastic delay difference 
equation in figure [TOl and note the good agreement. 





Figure 10: Scaling of the action along the optimal path as a function of 7. Here the solid line 
indicates the lAMM predication, and the circles represent a Monte-Carlo simulation of 1000 
runs. 



6. Discussion 

Wc have considered the problem of finding the trajectory in stochastic dy- 
namical systems that optimizes the probability of switching between two states, 
or causes one or more components to go extinct. In computing such a tra- 
jectory, called the optimal path, we needed to consider a numerical technique 
which could solve a Hamiltonian system with asymptotic boundary conditions 
in time. We have developed a numerical method, which wc call the iterative 
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action minimizing method (lAMM), for finding the optimal path of transition 
between two steady states in stochastic dynamical systems. This method is 
ideal for systems which can be written as two-point boundary value problems 
governed by Hamiltonian systems. We have validated the lAMM by presenting 
a variety of problems of interest, and have compared the numerical results with 
either analytic results or Monte-Carlo simulations of full stochastic systems. 

As demonstrated here, the lAMM method is robust enough to be applicable 
to a variety of different types of problems, including continuous SDE systems, 
such as the Duffing equation, discrete epidemic models of finite population size, 
such as the SIS model, and stochastic delay differential equations, in which the 
deterministic problem is infinite dimensional. The methodology is straightfor- 
ward enough to generalize to higher dimensions, in contrast to other commonly 
used methods, such as the shooting method, which is a major advantage of the 
lAMM. 

The primary limitations of this method are scaling issues in very high state 
space dimensions, and the finesse required in picking an initial guess that guaran- 
tees convergence, both of which are typical of iterative methods of quasi-Newton 
type. In the limit of small noise or large system size, however, due to the ro- 
bustness and ease of generalization to complex and high dimensional dynamical 
systems, the method offers a considerable advantage over simulating large sys- 
tems, or systems which require many Monte Carlo runs to generate statistics 
of the transitions paths. As a result, we expect this method will be useful in 
efficiently solving a large variety of optimal transition problems in the field of 
stochastic dynamical systems. 
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